Abstract. Phylogenetic networks are a generalization of phylogenetic
trees that are used in biology to represent reticulate or non-treelike
evolution. Recently, several algorithms have been developed which aim
to construct phylogenetic networks from biological data using triplets,
i.e. binary phylogenetic trees on 3-element subsets of a given set of
species. However, a fundamental problem with this approach is that the
triplets displayed by a phylogenetic network do not necessarily
uniquely determine or encode the network. Here we propose an
alternative approach to encoding and constructing phylogenetic
networks, which uses phylogenetic networks on 3-element subsets of a
set, or trinets, rather than triplets. More specifically, we show that
for a special, well-studied type of phylogenetic network called a
1-nested network, the trinets displayed by the network always encode
it. We also present an efficient algorithm that decides whether a dense
set of trinets (i.e. one that contains a trinet on every 3-element
subset of a set) can be displayed by a 1-nested network and, if so,
constructs that network. In addition, we discuss some potential new
directions that this approach opens up for constructing and comparing
phylogenetic networks.

1. Introduction

Phylogenetic networks are a generalization of phylogenetic trees that
are used in biology to represent reticulate or non-treelike evolution (cf.
[12, 23] for recent overviews). There are various types of phylogenetic
networks, but in this paper we shall focus on phylogenetic networks
that explicitly represent the evolution of a given set of species. Such
networks (whose formal definition is presented in Section 2) can be
essentially regarded as directed acyclic graphs having a single root,
whose internal vertices represent ancestral species and whose leaves
represent the given set of species (see e.g. Fig. 1). They have been used, just to
name a few examples, to represent the evolution of viruses [28], bacteria 
[25], plants [22], and fish [2Q]. 

Recently, several algorithms have been developed which aim to construct
phylogenetic networks (cf. [12, 23]). However, as stated in [12, p. xi],
"While there is a great need for practical and reliable computational
methods for inferring rooted phylogenetic networks to explicitly
describe evolutionary scenarios involving reticulate events, generally
speaking, such methods do not yet exist, or have not yet matured
enough to become standard tools".

Probably one of the main reasons for this is that we do not yet have 
a very good understanding of how to build up complex phylogenetic 
networks from simpler structures. An important case in point is the 
construction of phylogenetic networks from phylogenetic trees. Even 
though there has been a great deal of recent work on this problem (cf. 
[12, Chapter 11], [22, Section 2]), especially concerning the construction
of networks from triplets (i.e. binary phylogenetic trees with three 
leaves) [IDl [IB [131 IH [ISl III ED] ) , there is a fundamental obstacle to 
this approach: the trees displayed by a phylogenetic network do not
necessarily determine or encode the network [10] (even on 3 species;
see e.g. Fig. 1) and, in fact, we do not even know when a phylogenetic
network is uniquely determined by all of the trees that it displays [32].

As an alternative approach to tackling the problem of construct- 
ing phylogenetic networks, in this paper we shall investigate the fol- 
lowing strategy: Instead of constructing phylogenetic networks from 
trees, try to build them up from (simpler) phylogenetic networks. More 
specifically, we investigate how to construct phylogenetic networks from 
trinets, that is, phylogenetic networks having just three leaves (see, for
example, the networks $N_1$ and $N_2$ in Fig. 1).




Figure 1. Two distinct phylogenetic networks $N_1$ and $N_2$
with leaf set $\{x, y, z\}$ that display the same set
$\{T_1, T_2\}$ of phylogenetic trees. In particular, neither of
these two networks is encoded by this set of trees.

One of the main difficulties that we had to overcome before being able
to put this strategy into practice was to find an appropriate definition
for the set of trinets that is displayed by a phylogenetic network (see
Definition 3.1). However, with this definition in hand, we are able to
show that any 1-nested network (a quite simple and well-studied type
of phylogenetic network [7]) is always encoded by the set of trinets
it displays (Theorem 6.3). Moreover, using this fact, we provide a
polynomial-time algorithm that decides whether a given dense set of
trinets (i.e. one that contains a trinet on every 3-element subset of a
set) can be displayed by a 1-nested network and, if so, constructs
that network (see Fig. 10 and Theorem 7.3).

We now describe the contents of the rest of the paper. In Section 2
we introduce some relevant, basic terminology concerning phylogenetic
networks. In Section 3 we define the rather natural concept of a
recoverable network, and show that, although a phylogenetic network need
not be recoverable in general, a 1-nested network always is. In the
following section, we show that a recoverable phylogenetic network is
1-nested if and only if all of its displayed trinets are 1-nested
(Theorem 4.3). Using this fact and certain operations on 1-nested networks
that are closely related to those presented in [7] and that are presented
in Section 5, we then establish Theorem 6.3 in Section 6. As a corollary,
we obtain a new (and efficiently computable) proper metric on the set
of 1-nested networks all having the same leaf set (see Corollary 6.4). In
Section 7 we present our main algorithm for checking whether or not a
dense set of trinets is displayed by a 1-nested network. We conclude in
Section 8 with a discussion on some possible future directions, including
some ideas about how trinets might be used in practical applications.

2. Preliminaries 

For the rest of this paper, X is a non-empty, finite set (which will 
usually correspond to a set of species or organisms). For consistency, 
we follow the notation presented in [7] where appropriate. 

An rDAG $N = (V, A)$ is a directed acyclic graph (DAG) with non-empty
vertex set $V = V(N)$, non-empty arc set $A = A(N)$ (with no
multiple arcs) and single root $\rho = \rho_N$ (i.e. a DAG with precisely one
source $\rho$). We let $\le_N$ denote the usual partial order on $V$ induced by
$N$. The underlying graph of $N$ is denoted $\underline{N}$. A cycle in
$\underline{N}$ is a subset $C = \{v_1, v_2, \dots, v_n\} \subseteq V(\underline{N})$,
$n \ge 3$, such that $\{v_i, v_{i+1}\} \in E(\underline{N})$ for
all $1 \le i \le n - 1$ and $\{v_1, v_n\} \in E(\underline{N})$. If $C$ is some
cycle in $\underline{N}$ and there is some $v \ne w \in V$ so that the union of
all of the arcs in $N$ having both vertices in $C$ is the union of two
directed paths in $N$ that both start at $v$ and end at $w$, then $v$ ($w$) is
called the split (end) vertex of $C$. We denote an arc $a \in A$ with tail
$x$ ($= tail(a)$) and head $y$ by $(x, y)$. We call $(x, y)$ a cut arc (of $N$)
if the removal of the edge $\{x, y\}$ from $E(\underline{N})$ disconnects
$\underline{N}$. A vertex $v \in V$ is called a leaf of $N$ if
$indegree(v) = 1$ and $outdegree(v) = 0$. We denote the set of leaves
of $N$ by $L(N)$. Every vertex of $N$ that is neither the root $\rho_N$ nor has
outdegree 0 is called an interior vertex of $N$. A tree vertex $v \in V$ is an
interior vertex of $N$ with $indegree(v) = 1$, and a hybrid vertex $v \in V$ is
an interior vertex with $indegree(v) \ge 2$. Note that neither the root
nor a leaf of $N$ is a tree vertex and that a hybrid vertex of $N$ cannot
be a leaf.
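The degree conditions above can be illustrated with a short Python sketch. It is not part of the paper: the function name `classify_vertices` and the representation of an rDAG as a list of (tail, head) pairs are assumptions made purely for illustration, and the label "other" (for a sink with indegree at least 2, which is neither a leaf nor an interior vertex under the definitions above) is likewise hypothetical.

```python
from collections import Counter

def classify_vertices(arcs):
    """Label each vertex of an rDAG given as (tail, head) pairs."""
    indeg, outdeg = Counter(), Counter()
    vertices = set()
    for tail, head in arcs:
        outdeg[tail] += 1
        indeg[head] += 1
        vertices.update((tail, head))
    kinds = {}
    for v in vertices:
        if indeg[v] == 0:
            kinds[v] = "root"      # the single source
        elif outdeg[v] == 0 and indeg[v] == 1:
            kinds[v] = "leaf"
        elif outdeg[v] == 0:
            kinds[v] = "other"     # sink with indegree >= 2
        elif indeg[v] == 1:
            kinds[v] = "tree"      # interior vertex, indegree 1
        else:
            kinds[v] = "hybrid"    # interior vertex, indegree >= 2
    return kinds

# A small five-vertex example with one hybrid vertex.
example_arcs = [("u", "w"), ("u", "v"), ("v", "w"), ("v", "x"), ("w", "y")]
kinds = classify_vertices(example_arcs)
```

In this example `u` is the root, `v` is a tree vertex, `w` is a hybrid vertex, and `x` and `y` are leaves.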

Now, an $X$-rDAG is an rDAG $N = (V, A)$ with leaves uniquely
labeled by the elements in $X$ (i.e. there is a map $\varphi_N : X \to V$ that
maps $X$ bijectively onto $L(N)$). We will usually just assume $L(N) = X$
in case the labeling map is clear from the context. A phylogenetic
network $N = (V, A)$ (on $X$) is an $X$-rDAG such that every tree vertex
has outdegree at least 2 and every hybrid vertex has outdegree at least
1. If $N$ is such a network and $N' = (V', A')$ is a phylogenetic network
on a non-empty finite set $Y$, then $N$ is isomorphic to $N'$ if there is a
bijection $\xi : X \to Y$ and a directed graph isomorphism $\iota : V \to V'$
between $N$ and $N'$ such that $\varphi_{N'} \circ \xi = \iota \circ \varphi_N$. In
particular, in case $Y = X$ we consider $X$ as being a subset of both $V$ and
$V'$, and hence $N$ is isomorphic to $N'$ if and only if there is such an
isomorphism $\iota$ whose restriction to $X$ is the identity map on $X$.

A phylogenetic network $N = (V, A)$ on $X$ is

• a bush (on $X$) if it is isomorphic to the phylogenetic network
with vertex set $V = X \cup \{v\}$, $v \notin X$, and arc set
$A = \{(v, x) : x \in X\}$,

• a two-leafed network (on $X$) if $X = \{x, y\}$, and $N$ is isomorphic
to the phylogenetic network on $X$ with vertex set
$V = \{u, v, w, x, y\}$ and arc set
$A = \{(u, w), (u, v), (v, w), (v, x), (w, y)\}$,

• binary if all of its hybrid vertices have indegree 2 and outdegree
1 and all of its tree vertices have outdegree 2,

• 1-nested if every pair of cycles in $\underline{N}$ intersect in at most
1 vertex,

• a galled tree if every pair of cycles in $\underline{N}$ is disjoint,

• a (rooted) phylogenetic tree if $\underline{N}$ is a tree, and

• a trinet if $|L(N)| = |X| = 3$.



Note that in [7], 1-nested networks are defined in such a way that every hybrid
vertex has indegree 2 - we do not make this assumption, but we will use the same
name rather than introducing another term.

Figure 2. The fourteen possible non-isomorphic, 1-nested
trinets on the set $\{x, y, z\}$: the trees $T_1(x,y,z)$ and
$T_2(x,y,z)$ and the networks $N_1(x,y,z), \dots, N_{12}(x,y,z)$.
Directions on arcs are omitted for clarity; internal vertices
indicated with a dot are all hybrid vertices. Leaves that are at
the bottom of a trinet are indicated with large dots and vertices
hanging off the side of a trinet with a square.

Note that a 1-nested network $N$ on $X$ with $|X| = 1$ is a bush whose arc
set consists of precisely one arc, and if $|X| = 2$ then $N$ is isomorphic
to either a two-leafed network or a phylogenetic tree with 2 leaves.

In Fig. 2 we picture the set of all possible non-isomorphic 1-nested
trinets on $\{x, y, z\}$. If $N$ is a 1-nested trinet on $X$, $|X| = 3$, that is
not isomorphic to a phylogenetic tree on $X$, then we say that $t \in X$ is
at the bottom of $N$ if it corresponds to one of the vertices represented
by larger dots in Fig. 2, and we say that $t$ hangs off the side of $N$ if
it corresponds to one of the vertices represented by a square in that
figure (note that, in particular, there may be more than one element
at the bottom of a trinet).

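Finally, let $\mathcal{T}$ denote a non-empty set of trinets such that
$L(T) \in \binom{X}{3}$ for all $T \in \mathcal{T}$ (which we shall also call a
trinet set (on $X$) for short); here $\binom{X}{3}$ denotes the set of all
3-element subsets of $X$. If $Y \subseteq X$, $|Y| \ge 3$, we let
$\mathcal{T}_Y$ be the subset of $\mathcal{T}$ consisting of those
trinets $T \in \mathcal{T}$ with $L(T) \subseteq Y$. In addition, we call
$\mathcal{T}$ dense (on $X$) if
$\binom{X}{3} = \{L(T) : T \in \mathcal{T}\}$ and $|\mathcal{T}| = \binom{|X|}{3}$.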
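The density condition is easy to check mechanically. The following Python sketch is an illustration only: it assumes that each trinet is represented just by its leaf set, and the function name `is_dense` is hypothetical. It verifies both requirements, namely that every 3-element subset of $X$ occurs as a leaf set and that no leaf set occurs twice.

```python
from itertools import combinations

def is_dense(trinet_leaf_sets, X):
    """Check density of a trinet set, given one leaf set per trinet."""
    leaf_sets = [frozenset(s) for s in trinet_leaf_sets]
    triples = {frozenset(c) for c in combinations(X, 3)}
    # Every 3-subset of X must occur, and exactly once.
    return set(leaf_sets) == triples and len(leaf_sets) == len(triples)

X = ["a", "b", "c", "d"]
all_triples = [set(c) for c in combinations(X, 3)]
```

Here `is_dense(all_triples, X)` holds, whereas dropping one triple, or listing two trinets on the same leaf set, makes the check fail.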

3. Trinets and recoverable networks 

In this section, we investigate networks that display only 1-nested
trinets. In particular, we show that even if every trinet displayed by a
network $N$ is 1-nested, it does not necessarily follow that $N$ is 1-nested.
In addition, we shall introduce a rather natural condition on $N$ (that
it is a 'recoverable network') for which this statement does in fact hold
(see Theorem 4.3 in the next section).

Suppose $N = (V, A)$ is a phylogenetic network on $X$, $|X| \ge 3$, and
$Y$ is a non-empty subset of $V - \{\rho_N\}$. Let $v(Y)$ be the last vertex in
$V - Y$ that lies on all paths in $N$ from $\rho_N$ to every $y \in Y$. Note that if
$Y$ consists of a single vertex $y$, then $v(\{y\})$ is known as the immediate
dominator of $y$ [21] (see also [12, p. 143], where it is called the lowest
stable ancestor of $y$).
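Since $v(Y)$ is defined through vertices lying on all root-to-$y$ paths, it can be computed from dominator sets. The Python sketch below is one possible way to do this and is not taken from the paper: the names `dominators` and `last_common_vertex` and the arc-list representation are assumptions. It uses the standard one-pass dominator computation for DAGs (processing vertices in topological order) and then picks the common dominator that is furthest from the root.

```python
from graphlib import TopologicalSorter

def dominators(arcs, root):
    """dom[v] = set of vertices lying on every directed root-to-v path."""
    preds = {root: set()}
    for u, v in arcs:
        preds.setdefault(u, set())
        preds.setdefault(v, set()).add(u)
    dom = {}
    # static_order() yields every vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        if v == root:
            dom[v] = {root}
        else:
            dom[v] = {v} | set.intersection(*(dom[p] for p in preds[v]))
    return dom

def last_common_vertex(arcs, root, Y):
    """Sketch of v(Y): the last vertex outside Y on all paths to all of Y."""
    dom = dominators(arcs, root)
    common = set.intersection(*(dom[y] for y in Y)) - set(Y)
    # Common dominators form a chain; the deepest has the most dominators.
    return max(common, key=lambda d: len(dom[d]))

tree = [("r", "a"), ("r", "z"), ("a", "x"), ("a", "y")]
# A network in which every path to a leaf passes through the vertex t.
net = [("r", "a"), ("r", "b"), ("a", "h1"), ("b", "h1"), ("a", "h2"),
       ("b", "h2"), ("h1", "c"), ("h2", "c"), ("c", "t"),
       ("t", "x"), ("t", "y"), ("t", "z")]
```

For the tree, `last_common_vertex(tree, "r", {"x", "y"})` is `"a"`, while in the second network every pair of leaves yields `"t"`, which is not the root.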

We now present a key definition (see also Fig. 3):

Definition 3.1. Given a phylogenetic network $N$ on $X$ and some
$Y \in \binom{X}{3}$, we define the trinet on $Y$ displayed by $N$ to be the trinet
$N_Y$ with leaf set $Y$ which is obtained from $N$ by first taking the
subnetwork of $N$ consisting of the union of all directed paths in $N$ starting at
$v(Y)$ and ending at some element in $Y$, and then repeatedly first (i)
suppressing all vertices $v$ with $indegree(v) = outdegree(v) = 1$, and
then (ii) suppressing all multiple arcs that might result, until a trinet
on $Y$ is obtained. Put $Tr(N) = \{N_Y : Y \in \binom{X}{3}\}$.

Given a phylogenetic network $N$ on $X$, we say that a trinet set $\mathcal{T}$ on
$X$ is displayed by $N$ if $\mathcal{T} \subseteq Tr(N)$. Moreover, we say that
$\mathcal{T}$ encodes $N$ if $\mathcal{T} \subseteq Tr(N)$ and, if $N'$ is any other
phylogenetic network on $X$ with $\mathcal{T} \subseteq Tr(N')$, then $N'$ is
isomorphic to $N$.

Note that in Definition 3.1 it is necessary to consider (at least)
3-element subsets of $X$, since if 'binets' are defined in a similar way for
2-element subsets, then the resulting set would not in general encode
the network (even if the network is a tree). Also note that we do not
define a trinet on $Y$ displayed by $N$ to be the network consisting of the
union of all directed paths in $N$ to the elements of $Y$ as this can result
in networks with vertices having in- and outdegree 1, that is, networks
that are not phylogenetic networks.
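The construction in Definition 3.1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the name `displayed_trinet` is hypothetical, the vertex $v(Y)$ is assumed to have been computed already and is passed in as `start`, and arcs are stored in a set, so parallel arcs are merged as soon as they arise instead of in a separate step (ii). That interleaving is an assumption of the sketch.

```python
def displayed_trinet(arcs, start, Y):
    """Arc set of the network on Y obtained from N, with start = v(Y)."""
    arcs = set(arcs)
    succ, pred = {}, {}
    for u, v in arcs:
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)

    def closure(sources, nbrs):
        # All vertices reachable from sources along nbrs.
        seen, stack = set(sources), list(sources)
        while stack:
            for w in nbrs.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # Union of all directed paths from start to elements of Y.
    keep = closure([start], succ) & closure(Y, pred)
    sub = {(u, v) for u, v in arcs if u in keep and v in keep}

    while True:
        ins, outs = {}, {}
        for u, v in sub:
            outs.setdefault(u, []).append(v)
            ins.setdefault(v, []).append(u)
        # Look for a vertex with indegree = outdegree = 1.
        v = next((w for w in ins
                  if len(ins[w]) == 1 and len(outs.get(w, ())) == 1), None)
        if v is None:
            return sub
        (u,), (w,) = ins[v], outs[v]
        sub -= {(u, v), (v, w)}
        sub.add((u, w))  # the set silently merges a parallel arc

tree = [("r", "a"), ("a", "b"), ("a", "w"), ("b", "x"), ("b", "y"), ("r", "z")]
net = [("r", "a"), ("r", "h"), ("a", "h"), ("a", "w"), ("h", "c"),
       ("c", "x"), ("c", "y"), ("r", "z")]
```

In the second example the cycle through `a` and `h` disappears once the leaf `w` is dropped, so the network on `{x, y, z}` obtained from it is a tree even though the original network is not.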

The proof of the following lemma is straightforward and is omitted:

Lemma 3.2. Suppose that $N$ is a 1-nested network on $X$, $|X| \ge 3$.
Then any element in $Tr(N)$ is isomorphic to one of the fourteen trinets
on $\{x, y, z\}$ presented in Fig. 2.

Remark 3.3. If $N$ is a 1-nested network on $X$, $|X| \ge 3$, then $N$ is
binary if every element in $Tr(N)$ is isomorphic to either $T_1(x, y, z)$ or
one of $N_i(x, y, z)$, $1 \le i \le 7$. Moreover, binary level-1 networks and
galled trees (as defined in [7]) can be characterized in a similar manner.

Now, suppose that $N$ is a phylogenetic network on $X$ such that every
trinet in $Tr(N)$ is isomorphic to one of the fourteen trinets presented
in Fig. 2. It is tempting to think that this should imply that $N$ is
1-nested. However, this is not the case. For example, even if $N$ is
a phylogenetic network such that every trinet in $Tr(N)$ is isomorphic
to either $T_1(x,y,z)$ or $T_2(x,y,z)$ in Fig. 2, then $N$ is not necessarily
isomorphic to a phylogenetic tree (see e.g. Fig. 4). Even so, as we
shall show in the next section (see Theorem 4.3), the aforementioned
statement is almost correct.

Figure 3. (a) A phylogenetic network $N$ on $X =
\{x_1, \dots, x_n, z, y_1, \dots, y_n\}$, $n > 1$. (b) The subnetwork
$N'$ obtained by taking the union of the directed paths
from $v(Y)$ to every element in $Y = \{x_2, y_n, z\}$. (c) The
subnetwork $N''$ obtained from $N'$ by suppressing all multiple
arcs of $N'$. (d) The trinet $N_Y$ obtained from $N''$ by
suppressing all vertices $v \in V(N'')$ with $indegree(v) =
outdegree(v) = 1$. Directions of arcs are omitted when
clear.

Figure 4. A phylogenetic network $N$ on $\{x, y, z\}$ for
which $Tr(N)$ consists of precisely the trinet $T_1(x,y,z)$
but $N$ is not a phylogenetic tree on $\{x, y, z\}$. As before,
directions are omitted for clarity when clear. Also only
the vertices that are leaves are marked by a dot.

To this end, we now introduce a special class of networks. Suppose
that $N$ is a phylogenetic network on $X$ with $|X| \ge 3$. We say that
a vertex $v \in V(N)$ is reachable from a vertex $w \in V(N) - \{v\}$ if there
exists a directed path in $N$ starting at $w$ and ending in $v$. In addition,
if $z \in V(N)$ is a vertex of $N$ that lies on that path then we say that
$v$ is reachable from $w$ by crossing $z$. We denote by $v^*_N \in V(N)$ the
(necessarily unique) vertex in $N$ for which there exist some distinct
$x, y \in X$ with $v^*_N = v(\{x, y\})$ and, for all
$\{u, v\} \in \binom{X}{2} - \{\{x, y\}\}$,
either $v^*_N = v(\{u, v\})$ holds or $v(\{u, v\})$ is reachable from $v^*_N$.

Now, we say that $N$ is recoverable if $\rho_N = v^*_N$. We use the term
recoverable, since for biological data it would not be possible to infer
the structure of the network above $v^*_N$ in case $N$ is not recoverable,
as there would be no way to 'detect' vertices above $v^*_N$ using any pair
of elements in $X$. As an illustration, the vertex $v$ in the phylogenetic
network $N$ on $\{x, y, z\}$ pictured in Fig. 4 is the vertex $v^*_N = v(\{x, z\})$.
Since $v^*_N \ne \rho_N$, $N$ is not recoverable.

We now characterize recoverable networks $N$ on $X$, $|X| \ge 3$, in terms
of a special type of vertex. A vertex $v \in V(N)$ is a cut vertex of $N$
if the deletion of $v$ (plus its incident edges) from $\underline{N}$ disconnects
$\underline{N}$. We denote the resulting graph by $\underline{N} \backslash v$. If, in
addition, there exists a connected component $K$ of $\underline{N} \backslash v$ such
that $V(K) \cap L(N) = \emptyset$ then we call $v$ a separating vertex of $N$. For
example, in Fig. 4, $v$ is a separating vertex of $N$ whereas vertex $w$ is a
cut vertex of $N$.
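The two notions can be tested directly on small examples. The Python sketch below is illustrative only; the function names and the arc-list representation are assumptions. It deletes a vertex from the underlying undirected graph, computes the connected components of what remains, and then checks whether some component contains no leaf.

```python
def components_without(arcs, removed):
    """Connected components of the underlying graph after deleting a vertex."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    todo = set(adj) - {removed}
    comps = []
    while todo:
        comp, stack = set(), [todo.pop()]
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y != removed and y in todo:
                    todo.discard(y)
                    stack.append(y)
        comps.append(comp)
    return comps

def is_cut_vertex(arcs, v):
    return len(components_without(arcs, v)) > 1

def is_separating_vertex(arcs, v, leaves):
    comps = components_without(arcs, v)
    # Separating: a cut vertex with some component free of leaves.
    return len(comps) > 1 and any(not (c & set(leaves)) for c in comps)

tree = [("r", "a"), ("r", "z"), ("a", "x"), ("a", "y")]
net = [("r", "a"), ("r", "b"), ("a", "h1"), ("b", "h1"), ("a", "h2"),
       ("b", "h2"), ("h1", "c"), ("h2", "c"), ("c", "t"),
       ("t", "x"), ("t", "y"), ("t", "z")]
leaves = {"x", "y", "z"}
```

In the tree, `a` is a cut vertex but not a separating vertex; in the second network, deleting `t` leaves the component containing the root without any leaf, so `t` is a separating vertex.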

Proposition 3.4. Suppose $N$ is a phylogenetic network on $X$, $|X| \ge 3$.
Then the following statements hold:

(i) If $N$ is not recoverable then $v^*_N$ is a cut vertex of $N$.

(ii) $N$ is recoverable if and only if $v^*_N$ is not a separating vertex of
$N$.

Proof. (i) Note first that $\rho_N \ne v^*_N$ as $N$ is not recoverable. Let
$x, y \in X$ be distinct such that $v^*_N = v(\{x, y\})$, and assume for
contradiction that $v^*_N$ is not a cut vertex of $N$. Then there must exist
some leaf $l \in L(N)$ of $N$ that is reachable from $\rho_N$ without crossing
$v^*_N$. Hence, there exists some $z \in \{x, y\}$ such that
$v^*_N \ne v(\{l, z\})$ and $v^*_N$ is reachable from
$v(\{l, z\})$; a contradiction. Thus, $v^*_N$ is a cut vertex of $N$.

(ii) We prove the contrapositive of the statement, i.e. we show that
$N$ is not recoverable if and only if $v^*_N$ is a separating vertex of $N$.
Suppose $\{x, y\} \in \binom{X}{2}$ is such that $v^*_N = v(\{x, y\})$. Assume first
that $N$ is not recoverable. Then $\rho_N \ne v^*_N$ and, by (i), $v^*_N$ is a cut
vertex of $N$. Hence, for every leaf $l \in L(N)$ of $N$, every directed path
from $\rho_N$ to $l$ must cross $v^*_N$. Let $K_{\rho_N}$ denote the connected
component of $\underline{N} \backslash v^*_N$ that contains $\rho_N$ in its vertex set.
Then $V(K_{\rho_N}) \cap L(N) = \emptyset$. Thus $v^*_N$ is a separating vertex
of $N$.

Conversely, suppose that $v^*_N$ is a separating vertex of $N$. Then $v^*_N$ is
a cut vertex of $N$ and so every directed path from $\rho_N$ to a leaf $z \in L(N)$
of $N$ must cross $v^*_N$. If $N$ were recoverable then $\rho_N = v^*_N$ would follow,
implying that every vertex in $N$ must lie on a directed path from $v^*_N$ to
a leaf of $N$. But then $V(K) \cap L(N) \ne \emptyset$ for every connected component
$K$ in $\underline{N} \backslash v^*_N$; a contradiction. Thus, $N$ cannot be
recoverable. □

It immediately follows that 1-nested networks are always recoverable: 

Corollary 3.5. Suppose $N$ is a phylogenetic network on $X$, $|X| \ge 3$.
If $N$ is 1-nested, then $N$ is recoverable.

Proof. Suppose for contradiction that there exists a 1-nested network
$N$ on $X$ that is not recoverable, that is, $\rho_N \ne v^*_N$. Then, by
Proposition 3.4(i), $v^*_N$ is a cut vertex of $N$. Hence, for every leaf
$l \in L(N)$ of $N$, every directed path from $\rho_N$ to $l$ must cross $v^*_N$. Since
$outdegree(\rho_N) \ge 2$ and $N$ cannot have multiple arcs, it follows that
there exist (at least) 3 distinct directed paths in $N$ from $\rho_N$ to $v^*_N$.
But then there must exist two cycles in $\underline{N}$ which intersect in at
least 2 vertices; a contradiction. □

4. 1-nested trinets imply 1-nested networks

In the last section, we proved that if $N$ is a 1-nested phylogenetic
network on $X$, $|X| \ge 3$, then $N$ is recoverable. We shall now prove
that if all of the trinets displayed by a recoverable network are 1-nested,
then the network is 1-nested (Theorem 4.3).

To this end, suppose that $N$ is a phylogenetic network on $X$, $|X| \ge 3$,
and that $C$ is a cycle of $\underline{N}$. Put

$Z(C) = \{v \in C : \text{there exist } \{a, a'\} \in \binom{A(C)}{2} \text{ with } tail(a) = tail(a') = v\}.$

Clearly, $Z(C) \ne \emptyset$.

Now, suppose $l \in L(N)$ is a leaf of $N$ that is reachable from a hybrid
vertex of $N$. We denote by $p(l)$ the number of distinct directed paths
in $N$ from $\rho_N$ to $l$. Clearly $p(l) \ge 2$. Moreover, we denote by $w(l)$ the
unique vertex of $N$ distinct from $l$ that simultaneously lies on every
directed path from $\rho_N$ to $l$ such that (i) $w(l)$ is a hybrid vertex of $N$,
and (ii) there is a unique directed path from $w(l)$ to $l$ such that every
interior vertex of $N$ on this path is a tree vertex of $N$. To illustrate
these definitions, consider the network $N$ on $\{x, y, z\}$ depicted in Fig. 4.
Then $w(y)$ is the unique hybrid vertex of $N$ and $p(y) = 3$.
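The quantity $p(l)$ can be computed by a simple dynamic program over a topological order of the DAG. The following Python sketch is illustrative only: the name `count_root_paths` and the arc-list representation are assumptions, and the sketch returns the path count for every vertex, not just for leaves below a hybrid vertex.

```python
from graphlib import TopologicalSorter

def count_root_paths(arcs, root):
    """Number of distinct directed paths from the root to each vertex."""
    preds = {root: set()}
    for u, v in arcs:
        preds.setdefault(u, set())
        preds.setdefault(v, set()).add(u)
    paths = {}
    # Each vertex is visited after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        paths[v] = 1 if v == root else sum(paths[u] for u in preds[v])
    return paths

two_leafed = [("u", "w"), ("u", "v"), ("v", "w"), ("v", "x"), ("w", "y")]
paths = count_root_paths(two_leafed, "u")
```

For the two-leafed network used here, the leaf `y` below the hybrid vertex `w` is reached by two paths, while `x` is reached by one.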

We now prove some useful, but somewhat technical, results concerning
the set $Z(C)$.

Figure 5. The situation considered in the proof of
Proposition 4.1. The vertices in $C$ which have two of
their incoming (outgoing) arcs contained in $A(C)$ are
marked with squares (triangles). The leaves of $N$ plus
the vertex $v(\{z_1, \dots, z_m\})$ are marked by dots. For
clarity all other vertices are not marked. The directed lines
represent directed paths rather than arcs.

Proposition 4.1. Suppose $N$ is a recoverable phylogenetic network on
$X$, $|X| \ge 3$, such that every trinet in $Tr(N)$ is isomorphic to one of
the fourteen trinets on $\{x, y, z\}$ depicted in Fig. 2. Then $|Z(C)| = 1$,
for all cycles $C$ in $\underline{N}$.

Proof. Suppose for contradiction that $\underline{N}$ contains a cycle $C$ with
$m := |Z(C)| \ge 2$. Put $Z(C) = \{z_1, \dots, z_m\}$. Since $C$ is a cycle in
$\underline{N}$ there must exist distinct vertices $h_i \in C$, $1 \le i \le m$, such
that, for all $1 \le i \le m$, two of the incoming arcs of $h_i$ are contained in
$A(C)$ and $h_i$ can be reached from $z_i$ and from $z_{i+1}$, $1 \le i \le m$, where we
define $z_{m+1} := z_1$. Moreover for each such vertex $h_i$ there must exist a leaf
$l_i \in L(N)$ of $N$ that is reachable from $h_i$. Note that some of the leaves
$l_i$ might be the same (see Fig. 5 for a representation of the generic
situation in which all leaves $l_i$, $1 \le i \le m$, are distinct).

Choose some $i \in \{1, \dots, m\}$, say $i = 1$, and let $\sigma$ be the ordering
$l_1, l_2, \dots, l_m$ of the leaves $l_j$, $1 \le j \le m$, induced by $C$ via the
vertices $h_i$, $1 \le i \le m$. If there exist at least three distinct leaves in
that ordering, then let $l_{i_1}, l_{i_2}, l_{i_3}$ denote the first three distinct
leaves in $\sigma$. Note that $l_1 = l_{i_1}$ and each of $l_{i_1}$, $l_{i_2}$ and
$l_{i_3}$ is reachable from $v(\{z_1, z_2, \dots, z_p\}) \in V(N)$, where
$p \in \{1, \dots, m\}$ is such that for all $i_3 \le q < p$ we have
$l_q = l_{i_3}$. But then $p \ge 3$ and the trinet $N'$ on
$\{l_{i_1}, l_{i_2}, l_{i_3}\}$ displayed by $N$
contains $v(\{z_2, \dots, z_p\})$ in its vertex set if $p \ne m$ and, otherwise, the
vertex $v(\{z_1, z_2, \dots, z_p = z_m\})$. Hence, if $p \ne m$ then $z_j \in V(N')$,
$2 \le j \le p - 1$, and otherwise, $z_j \in V(N')$ with $j = 1, \dots, m$.
Consequently, $N'$ contains two cycles that intersect in a path of length 1 or
more in each case. Since, by construction, each cycle is the union of two directed
paths in $N'$ that have the same start and end vertex this implies that
$N'$ is not of the specified form, a contradiction.

Now, if there exist just two leaves $l_{i_1}$ and $l_{i_2}$ in $\sigma$ that are
distinct, then choose some $l \in L(N) - \{l_{i_1}, l_{i_2}\}$, which must exist as
$|X| \ge 3$. Since each of $l_{i_1}$ and $l_{i_2}$ is reachable from
$v(\{z_1, z_2, \dots, z_m\}) \in V(N)$ it follows that
$v(\{z_1, z_2, \dots, z_m\})$ is a vertex in the trinet $N'$ on
$\{l, l_{i_1}, l_{i_2}\}$ displayed by $N$. But then $z_j \in V(N')$,
$1 \le j \le m$, which implies that $N'$ contains two cycles that intersect in a
path of length at least 1. As before, this yields a contradiction.

So suppose that $l_i = l_j$ for all $i, j \in \{1, \dots, m\}$. Let
$L_{w(l_1)} \subseteq L(N)$ denote the set of leaves of $N$ that are reachable from
$w(l_1)$. We claim that $w(l_1)$ is not a cut vertex of $N$. Suppose for
contradiction that $w(l_1)$ is a cut vertex of $N$. Then, since $N$ is recoverable,
there must exist a leaf $l \in L(N) - L_{w(l_1)}$ that is reachable from $\rho_N$
without crossing $w(l_1)$. Choose some $l' \in L(N) - \{l_1, l\}$, which must exist
as $|X| \ge 3$. Since $l_1$ is reachable from each of $z_j$, $1 \le j \le m$,
$v(\{z_1, z_2, \dots, z_m\}) \in V(N)$ must be a vertex in the trinet $N'$ on
$\{l, l', l_1\}$ displayed by $N$. But then $z_i \in V(N')$, $1 \le i \le m$, and so
we obtain a contradiction as before. Thus, $w(l_1)$ cannot be a cut vertex of $N$,
as claimed.

Thus, there must exist some leaf $l \in L_{w(l_1)}$ that is reachable from
$\rho_N$ without crossing $w(l_1)$. But then $l_1 \ne l$, by the definition of
$w(l_1)$. Arguments similar to the ones used in the previous case can now be
used to obtain a final contradiction. Thus, $|Z(C)| = 1$ must hold for
every cycle $C$ of $\underline{N}$. □

To establish Theorem 4.3 we will use one further result that follows
from the last proposition. Suppose $N$ is a phylogenetic network on $X$,
$|X| \ge 3$, and $C$ is a cycle in $\underline{N}$ with $|Z(C)| = 1$. Then we denote
the unique vertex in $C$ that has two of its incoming arcs contained in $A(C)$
by $h_C$.

Corollary 4.2. Let $N$ be a recoverable phylogenetic network on $X$,
$|X| \ge 3$, such that every trinet in $Tr(N)$ is isomorphic to one of the
fourteen trinets on $\{x, y, z\}$ depicted in Fig. 2. Let $C_1$ and $C_2$ denote
two distinct cycles of $\underline{N}$ for which $A(C_1) \cap A(C_2) \ne \emptyset$
holds, and let $l \in L(N)$ denote a leaf of $N$ that is reachable from both
$h_{C_1}$ and $h_{C_2}$. Then $w(l)$ is not a cut vertex of $N$.

Proof. Suppose for contradiction that this is not the case, that is, there
exists a recoverable phylogenetic network $N$ on $X$, two distinct cycles
$C_1$ and $C_2$ in $\underline{N}$ with $A(C_1) \cap A(C_2) \ne \emptyset$, and a leaf
$l \in L(N)$ of $N$ that is reachable from $h_{C_1}$ and from $h_{C_2}$ but that
$w(l)$ is a cut vertex of $N$. Since $N$ is recoverable, there must exist a leaf
$l' \in L(N) - \{l\}$ of $N$ that is reachable from $\rho_N$ without crossing $w(l)$.

Now let $z_i \in V(N)$ denote the unique vertex in $Z(C_i)$, $i = 1, 2$. Note
that $z_1 = z_2$ might hold. Since $l$ is clearly also reachable from both
$z_1$ and $z_2$, there must exist a directed path from $\rho_N$ to $v(\{z_1, z_2\})$
that crosses $v(\{l, l'\})$. Choose some $l'' \in L(N) - \{l, l'\}$ which must exist
as $|X| \ge 3$. Then the trinet $N'$ on $\{l, l', l''\}$ displayed by $N$ contains
the vertex $v(\{z_1, z_2\})$ and thus every arc in $A(C_1) \cup A(C_2)$. Since
$A(C_1) \cap A(C_2) \ne \emptyset$ it follows that $N'$ is not of the specified form
which is impossible. □

We now prove the main result of this section: 

Theorem 4.3. Suppose that $N$ is a recoverable phylogenetic network
on $X$, $|X| \ge 3$. Then $N$ is 1-nested if and only if every trinet in $Tr(N)$
is isomorphic to one of the fourteen trinets depicted in Fig. 2.

Proof. If $N$ is 1-nested then, by Lemma 3.2, the trinets in $Tr(N)$ are
of the specified form.

Conversely, suppose that the trinets in $Tr(N)$ are of the specified
form. Assume for contradiction that $N$ is not 1-nested. Then there
must exist two cycles $C_1$ and $C_2$ in $\underline{N}$ which intersect in more than
one vertex. Moreover, amongst all such pairs of cycles, there must
exist a pair $C_1$ and $C_2$ for which the following holds: There is a path
$P$ with $V(P) \subseteq C_1 \cap C_2$ which has an end vertex $x_2 \in V(P)$ such
that the edge $\{x_1, x_2\} \in E(P)$ is the arc $(x_1, x_2)$ in $A(N)$ and
$\{y, x_2\} \notin E(C_1) \cap E(C_2)$, for all $y \in (C_1 \cap C_2) - \{x_1, x_2\}$.
Choose some $z_i \in Z(C_i)$, $i = 1, 2$, and note that, by Proposition 4.1,
$|Z(C_i)| = 1$. However note that $z_1 = z_2$ might hold.

Let $l_i \in L(N)$ denote a leaf of $N$ that is reachable from $h_i = h_{C_i}$,
$i = 1, 2$. Then one of the three generic cases (a) - (c) pictured in Fig. 6
must hold. Note that in the case of (b) and (c) we can choose $l_1$ to
equal $l_2$ since in case of (b) we have $x_2 = h_2$ and in case of (c) we have
$x_2 = h_2 = h_1$.

Suppose first that Case (a) holds. We begin by considering the case
$l_1 = l_2$. Since $N$ is recoverable, Corollary 4.2 implies that $w(l_1)$ is not
a cut vertex of $N$. Let $L_{w(l_1)} \subseteq L(N)$ denote the set of leaves of $N$
that are reachable from $w(l_1)$. Then there must exist a leaf
$l \in L_{w(l_1)}$ of $N$ that is reachable from $\rho_N$ without crossing $w(l_1)$. By
the definition of $w(l_1)$, $l_1 \ne l$. Since $l$ is reachable from $z_1$ and from
$z_2$, there must exist a directed path from $\rho_N$ to $v(\{z_1, z_2\})$ that crosses
$v(\{l_1, l\})$. Choose some $l' \in L(N) - \{l_1, l\}$, which must exist as
$|X| \ge 3$. Then the trinet $N'$ on $\{l_1, l, l'\}$ displayed by $N$ contains the
vertex $v(\{z_1, z_2\})$. Thus, $N'$ contains two cycles that intersect in the edge
$\{x_1, x_2\}$. Since each cycle is the union of two directed paths in $N'$ that
have the same start vertex and the same end vertex, it follows that $N'$ is not
of the specified form, a contradiction. Thus $l_1 \ne l_2$ must hold.

Figure 6. The three generic cases considered in the
proof of Theorem 4.3. The vertices in $C_i$, $i = 1, 2$, which
have two of their incoming (outgoing) arcs contained in
$A(C_i)$, $i = 1, 2$, are marked with squares (triangles). The
leaves of $N$ plus the vertices $\rho_N$, $x_1$, $x_2$, $v(\{z_1, z_2\})$ and
$w(l_i)$, $i = 1, 2$, are marked by dots. For clarity all other
vertices are not marked. The directed lines represent
directed paths rather than arcs.

Since $|X| \ge 3$, we may choose some $l \in L(N) - \{l_1, l_2\}$. But then
similar arguments applied to the trinet $N'$ on $\{l_1, l_2, l\}$ displayed by
$N$ yield a contradiction.

Similar arguments can be used to show that Case (b) and Case (c)
lead to a contradiction. But this implies that there cannot exist two
distinct cycles of $\underline{N}$ that intersect in more than one vertex. Thus, $N$
must be 1-nested. □

As a corollary we see that if all of the trinets displayed by a re- 
coverable phylogenetic network are trees then the network must be a 
tree. 

Corollary 4.4. Suppose $N$ is a recoverable phylogenetic network on
$X$, $|X| \ge 3$. Then $N$ is a phylogenetic tree on $X$ if and only if every
trinet in $Tr(N)$ is isomorphic to either the trinet $T_1(x, y, z)$ or the
trinet $T_2(x, y, z)$ on $\{x, y, z\}$.

Proof. This is an immediate consequence of Theorem 4.3 and the fact
that if $N$ is a recoverable 1-nested network on $X$ then $\underline{N}$ contains a
cycle if and only if there exists a trinet $N' \in Tr(N)$ such that
$\underline{N'}$ contains a cycle. □

5. Cherries, cactuses and reductions



In the next section, we shall show that the set of trinets displayed
by a 1-nested phylogenetic network encodes the network. To do this, we
will use some operations that can be performed on 1-nested networks to
produce new 1-nested networks, which we shall now introduce. These
operations are very closely related to the "R, T and G-operations"
presented in [7, Section 4]. In consequence, we shall omit the proofs of
the results that we state concerning our operations, instead citing the
related results in [7, Section 4] which have very similar proofs.

Suppose $N = (V, A)$ is a 1-nested network on $X$, $|X| \ge 2$. We
call a subset $S \subseteq X$ a cherry of $N$ if $|S| \ge 2$ and there is some
$v_S \in V$ such that $(v_S, x) \in A$ for all $x \in S$ and $(v_S, x) \notin A$ for all
$x \in X - S$ (see Fig. 7(a)). Moreover, we shall call such a cherry isolated
if $outdegree(v_S) = |S|$ and $indegree(v_S) = 1$ (see Fig. 7(b)). Note that
if $S$ is a cherry of $N$ and $S = X$, then $N$ is isomorphic to a bush on
$X$. We now define a related concept. If $|X| \ge 2$, we call a tuple
$H = (a_1, a_2, \dots, a_p : b_1, b_2, \dots, b_q : z)$ of distinct elements of $X$ with
$p \ge 1$, $q \ge 0$ a cactus of $N$ (with support
$S = \{a_1, a_2, \dots, a_p, b_1, b_2, \dots, b_q, z\}$)
if there is a cycle $C_H$ in $\underline{N}$ with split vertex $v_H$ such that the network
induced by $N$ on $C_H \cup S$ is as pictured in Fig. 7(c) (note that if $q = 0$,
we take the tuple to be $H = (a_1, a_2, \dots, a_p : \; : z)$). Moreover, such a
cactus $H$ is called isolated if $indegree(v_H) = 1$ and $outdegree(v_H) = 2$
(see Fig. 7(d)). Note that a two-leafed network on a set of size two is
a cactus.

Figure 7. (a) A cherry $S = \{x_1, x_2, \ldots, x_m\}$, $m \ge 2$,
(b) an isolated cherry $S = \{x_1, x_2, \ldots, x_m\}$, $m \ge 2$, (c)
a cactus $H = (a_1, a_2, \ldots, a_p : b_1, b_2, \ldots, b_q : z)$,
$p \ge 1$, $q \ge 0$, and (d) an isolated cactus $H = (a_1, a_2,
\ldots, a_p : b_1, b_2, \ldots, b_q : z)$, $p \ge 1$, $q \ge 0$. Note
that the arcs ending at $v_S$ and $v_H$ in (a) and (c) do not
necessarily exist.

Now, suppose that $N$ is a 1-nested network on $X$, $|X| \ge 2$. In
case there is a non-isolated cherry $S$ of $N$ and $z \in S$, then we
define a cherry reduction $C = C_{z:S}$ on $N$ to be the network
$C_{z:S}(N)$ which is obtained by removing all leaves in $S$ except
$z$ from $N$, together with their incident arcs. In addition, if $S$
is an isolated cherry of $N$ and $z \in S$, then we define an isolated
cherry reduction $\overline{C} = \overline{C}_{z:S}$ on $N$ to be the
network $\overline{C}_{z:S}(N)$ which is obtained by removing all
leaves in $S$ from $N$, together with their incident arcs, and
replacing the vertex $v_S$ by $z$, which now becomes a leaf of the new
network.
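
As an illustration of the two cherry reductions, here is a minimal
sketch on arc sets. The representation (a set of (tail, head) pairs)
and the function names are assumptions made for this example only.

```python
def cherry_reduction(arcs, S, z):
    """Non-isolated case: delete every leaf of S except z, with its incident arc."""
    drop = set(S) - {z}
    return {(u, v) for (u, v) in arcs if v not in drop}

def isolated_cherry_reduction(arcs, v_S, S, z):
    """Isolated case: delete all leaves of S, then let z take the place of v_S."""
    S = set(S)
    kept = {(u, v) for (u, v) in arcs if v not in S}
    # v_S has no outgoing arcs left, so only its incoming arc is relabelled
    return {(u, z if v == v_S else v) for (u, v) in kept}

def leaf_set(arcs):
    """Vertices that are heads of arcs but never tails."""
    tails = {u for u, _ in arcs}
    return {v for _, v in arcs if v not in tails}

# Hypothetical network: {x, y} is a non-isolated cherry at u (u also has
# the child c), and {p, q} is an isolated cherry at c.
net = {("r", "u"), ("u", "x"), ("u", "y"), ("u", "c"), ("c", "p"), ("c", "q")}
reduced = cherry_reduction(net, {"x", "y"}, "x")
reduced_iso = isolated_cherry_reduction(net, "c", {"p", "q"}, "p")
```

In both cases the leaf set of the result is $X - (S - \{z\}$), so
exactly $|S| - 1$ leaves disappear.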

Similarly, suppose there is a cactus $H = (a_1, a_2, \ldots, a_p :
b_1, b_2, \ldots, b_q : z)$ of $N$ with support $S$. If $H$ is not
isolated, then we define a cactus reduction $H = H_{z:S} =
H_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}$ on $N$ to be the network
$H_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}(N)$ which is obtained by
removing the vertices $(C_H - \{v_H\}) \cup (S - \{z\})$, together
with their induced arcs plus the two outgoing arcs of $v_H$ contained
in $A(C_H)$, from $N$ and then adding in the new arc $(v_H, z)$. In
addition, if $H$ is isolated, then we define an isolated cactus
reduction $\overline{H} = \overline{H}_{z:S} =
\overline{H}_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}$ on $N$ to be
the network $\overline{H}_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}(N)$
which is obtained by removing the vertices $(C_H - \{v_H\}) \cup
(S - \{z\})$, together with their induced arcs plus the two outgoing
arcs of $v_H$, from $N$ and replacing $v_H$ with $z$.

It is clear that the networks $C_{z:S}(N)$, $\overline{C}_{z:S}(N)$,
$H_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}(N)$ and
$\overline{H}_{a_1,a_2,\ldots,a_p:b_1,b_2,\ldots,b_q:z}(N)$ are all
1-nested networks on the set $X - (S - \{z\})$ and that they all have
$|S| - 1$ fewer leaves than $N$. Moreover, we have:

Proposition 5.1. [3, Proposition 2] Suppose that $N$ is a 1-nested
network on $X$, $|X| \ge 1$. If $|X| \ge 2$, then at least one of the
reductions $C$, $\overline{C}$, $H$, $\overline{H}$ may be applied to
$N$. Moreover, if none of the reductions $C$, $\overline{C}$, $H$,
$\overline{H}$ may be applied to $N$, then $|X| = 1$ and $N$ is the
bush on $X$.

We can also define 'inverses' $C^{-1}$, $\overline{C}^{-1}$, $H^{-1}$,
$\overline{H}^{-1}$ of the reductions $C$, $\overline{C}$, $H$,
$\overline{H}$ as follows. Given a 1-nested network $N$ on $X$,
$|X| \ge 1$, a leaf $z \in X$ of $N$, and a finite set $S$ with
$|S| \ge 2$ and $S \cap X = \{z\}$, we define the cherry expansion
$C^{-1}_{z:S}$ of $N$ to be the network $C^{-1}_{z:S}(N)$ obtained by
replacing leaf $z$ by a new vertex $v$, and adding in new arcs
$(v, s)$ for all $s \in S$. Clearly $C^{-1}_{z:S}(N)$ is a 1-nested
network on $X \cup S$.

Isolated cherry, cactus and isolated cactus expansions, corresponding
to $\overline{C}$, $H$ and $\overline{H}$, are defined in a similar
way.

It is straight-forward to see that a reduction and its corresponding
expansion are mutual inverses, in that when one is applied to a
1-nested network $N$ on $X$ and then its inverse, we obtain a network
that is isomorphic to $N$. Moreover, we have:

Lemma 5.2. [7, Lemma 4] Let $N$ and $N'$ be two 1-nested networks on
$X$, $|X| \ge 3$. If $N$ and $N'$ are isomorphic, then if one of the
reductions $C$, $\overline{C}$, $H$, $\overline{H}$ (respectively,
expansions $C^{-1}$, $\overline{C}^{-1}$, $H^{-1}$,
$\overline{H}^{-1}$) may be applied to $N$, then the same one may also
be applied to $N'$ and the two resulting 1-nested networks are
isomorphic.

6. Encoding 1-nested networks with trinets 

In this section we show that the set of (necessarily 1-nested) trinets
displayed by a 1-nested network $N$ on $X$ encodes $N$ (see Theorem 6.3).

We begin by describing how to characterize cherries and cactuses in 
a 1-nested network in terms of their trinets, starting with cherries. To 
this end, we associate to a trinet set $\mathcal{T}$ on $X$ and a
non-empty subset $S \subseteq X$ the trinet set

$\mathcal{T}|_S := \{N \in \mathcal{T} : S \cap L(N) \neq \emptyset\}$.
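
The restriction just defined is a plain filter on leaf sets. A minimal
sketch follows, assuming a hypothetical record type Trinet that stores
a type label and an ordered leaf tuple; neither the type nor the labels
are taken from the paper.

```python
from typing import NamedTuple

class Trinet(NamedTuple):
    kind: str      # hypothetical type label, e.g. "T1" or "N4"
    leaves: tuple  # ordered leaf labels, e.g. ("x", "y", "z")

def restrict(trinets, S):
    """Return the trinets whose leaf set meets S (the restriction to S)."""
    S = set(S)
    return {t for t in trinets if S & set(t.leaves)}

# Toy data with made-up labels
ts = {Trinet("T1", ("a", "b", "c")), Trinet("T2", ("c", "d", "e"))}
only_first = restrict(ts, {"a"})
```

Note that a trinet survives the restriction as soon as it shares one
leaf with $S$; it need not lie inside $S$.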

Lemma 6.1. Suppose $N = (V, A)$ is a 1-nested network on $X$,
$|X| \ge 3$, and let $S \subseteq X$ with $|S| \ge 2$. Let
$\mathcal{T}$ be a non-empty subset of $Tr(N)$. Then $S$ is a cherry
of $N$ with $\mathcal{T} = Tr(N)|_S$ if and only if $\mathcal{T}$
satisfies the following properties:

(C1) $L(N') \cap S \neq \emptyset$, for all $N' \in \mathcal{T}$ (or
equivalently, $\mathcal{T}|_S = \mathcal{T}$).
(C2) For all $\{x, y\} \in \binom{S}{2}$ and all $z \in X - S$, either
$T_1(x,y,z)$, $T_2(x,y,z)$, $N_3(z,x,y)$, $N_4(x,y,z)$, $N_9(x,y,z)$
or $N_{10}(z,x,y)$ is in $\mathcal{T}$.
(C3) For all $\{x, y, z\} \in \binom{S}{3}$, $T_2(x,y,z) \in \mathcal{T}$.
(C4) There is no $S' \subseteq X$ such that $S \subsetneq S'$ and
$\mathcal{T}$ satisfies (C2) and (C3) with $S$ replaced by $S'$.

Moreover, if this is the case and $S \neq X$ (or, equivalently,
$|X - S| \ge 1$), then $S$ is isolated if and only if $\mathcal{T}$
also satisfies:

(C5) For all $\{x, y\} \in \binom{S}{2}$ and all $z \in X - S$, either
$T_1(x,y,z)$, $N_3(z,x,y)$ or $N_4(x,y,z)$ is contained in $\mathcal{T}$.

Proof. Suppose $\mathcal{T} = Tr(N)|_S$ holds for some cherry $S$ of
$N$. Then it is straight-forward to check that $\mathcal{T}$ satisfies
(C1)-(C4).

Conversely, suppose $\mathcal{T}$ satisfies (C1)-(C4). Let $v = v(S)$.
Note that $v(\{x,y\}) = v$ for all $\{x,y\} \in \binom{S}{2}$, since
otherwise there would exist some $z \in S$ such that $T_2(x,y,z)
\notin \mathcal{T}$, in contradiction to (C3). Moreover, suppose there
were some $z \in X - S$, $x \in S$ with $v(\{z,x\}) >_N v$. Let
$y \in S - \{x\}$ (which exists since $|S| \ge 2$). Then none of the
trinets $T_1(x,y,z)$, $T_2(x,y,z)$, $N_3(z,x,y)$, $N_4(x,y,z)$,
$N_9(x,y,z)$ or $N_{10}(z,x,y)$ could be contained in $\mathcal{T}$,
in contradiction to (C2). Thus, for all $z \in X - S$ and all
$x \in S$, we have $v(\{z,x\}) \le_N v$ with possibly equality
holding. It follows that $(v, x) \in A$ for all $x \in S$.

Now, suppose there is some $r \in X - S$ with $(v, r) \in A$. Let
$S' = S \cup \{r\}$. Then it is straight-forward to check that, for
all $x \in S$ and all $z \in X - S'$, either $T_1(x,r,z)$,
$T_2(x,r,z)$, $N_3(z,x,r)$, $N_4(x,r,z)$, $N_9(x,r,z)$ or
$N_{10}(z,x,r)$ is in $\mathcal{T}$, and that $T_2(x,y,r) \in
\mathcal{T}$ for all $\{x,y\} \in \binom{S}{2}$. This implies that
$\mathcal{T}$ satisfies (C2) and (C3) with $S$ replaced by $S'$, which
contradicts (C4). In particular, it follows that $S$ is a cherry of
$N$.

To see that $\mathcal{T} = Tr(N)|_S$ holds note first that
$\mathcal{T} \subseteq Tr(N)|_S$ is a consequence of (C1). To see that
$Tr(N)|_S \subseteq \mathcal{T}$ suppose $N' \in Tr(N)|_S$. Then
$L(N') \cap S \neq \emptyset$ and so $N' \in \mathcal{T}$ follows from
considering the size of the intersection $L(N') \cap S$ in conjunction
with Properties (C2) and (C3).

To complete the proof, suppose that $X \neq S$. First note that if $S$
is an isolated cherry of $N$, then (C5) clearly holds. Conversely, if
$\mathcal{T}$ satisfies (C5), then let $v \in V$ be the vertex with
$(v, x) \in A$ for all $x \in S$ and $(v, x) \notin A$ for all
$x \in X - S$ (which exists since $S$ is a cherry by (C2)-(C4)). Then
$\mathrm{outdegree}(v) = |S|$, since otherwise there would exist some
$\{x,y\} \in \binom{S}{2}$ and $z \in X - S$ with $z >_N v$ such that
either $T_2(x,y,z)$ or $N_{10}(z,x,y) \in \mathcal{T}$, in
contradiction to (C5).

Now, since $|X - S| \ge 1$, $\mathrm{indegree}(v) \ge 1$. Suppose
$\mathrm{indegree}(v) > 1$. Then there must exist some $z \in X - S$
and $\{x,y\} \in \binom{S}{2}$ such that $N_9(x,y,z) \in \mathcal{T}$,
which contradicts (C5). Therefore $\mathrm{indegree}(v) = 1$, which
completes the proof. □

We now present a similar result for cactuses. 

Lemma 6.2. Let $N$ be a 1-nested network on $X$, $|X| \ge 3$, and let
$H = (a_1, \ldots, a_p : b_1, \ldots, b_q : z)$ be a tuple of distinct
elements in $X$ with $p \ge 1$ and $q \ge 0$. Put $S = \{a_1, \ldots,
a_p, b_1, \ldots, b_q, z\}$ and let $\mathcal{T}$ be a non-empty
subset of $Tr(N)$. Then $H$ is a cactus of $N$ with support $S$ and
$\mathcal{T} = Tr(N)|_S$ if and only if, with $A = \{a_1, \ldots,
a_p\}$ and $B = \{b_1, \ldots, b_q\}$, $\mathcal{T}$ satisfies the
following properties:

(H1) $L(N') \cap S \neq \emptyset$ for all $N' \in \mathcal{T}$ (or,
equivalently, $\mathcal{T}|_S = \mathcal{T}$).
(H2) $N_1(x,z,y) \in \mathcal{T}$ for all $x \in A$, $y \in B$.
(H3) $N_2(z,x,x') \in \mathcal{T}$ for all $x = a_i$, $x' = a_j$,
$1 \le i < j \le p$, or $x = b_r$, $x' = b_s$, $1 \le r < s \le q$.
(H4) $T_1(x,x',x'') \in \mathcal{T}$ for all $x = a_i$, $x' = a_j$,
$x'' = a_k$, $1 \le i < j < k \le p$, or $x = b_r$, $x' = b_s$,
$x'' = b_t$, $1 \le r < s < t \le q$.
(H5) For all $w \in X - S$ either $N_5(z,x,w)$, or $N_6(z,x,w)$, or
$N_7(w,x,z)$, or $N_8(z,x,w)$, or $N_{11}(z,x,w)$, or $N_{12}(w,x,z)$
is contained in $\mathcal{T}$, for all $x \in A$ or $x \in B$.






K. T. HUBER AND V. MOULTON. 



(H6) For all $w \in X - S$, either $T_1(x,x',w)$, or $N_3(w,x,x')$, or
$N_4(x,x',w)$ is contained in $\mathcal{T}$, for all $x \neq x' \in A$
or $x \neq x' \in B$.
(H7) For all $w \in X - S$, one of $T_1(x,y,w)$, $T_1(x,w,y)$,
$T_1(y,w,x)$, $T_2(x,y,w)$, $N_4(x,y,w)$, and $N_1(x,w,y)$ is
contained in $\mathcal{T}$, for all $x \in A$ and $y \in B$.
(H8) There exists no tuple $H' = (c_1, \ldots, c_t : d_1, \ldots, d_s
: z)$ of distinct elements in $X$, $t \ge 1$ and $s \ge 0$, with
$S \subsetneq S' := \{c_1, \ldots, c_t, d_1, \ldots, d_s, z\}$ such
that $\mathcal{T}$ satisfies (H2)-(H7) for $S'$.

Moreover, if this is the case and $S \neq X$, then $H$ is isolated if
and only if $\mathcal{T}$ also satisfies:

(H9) For all $w \in X - S$, $T_2(x,y,w) \notin \mathcal{T}$, for all
$x \in A$, $y \in B$, and $N_8(z,x,w) \notin \mathcal{T}$,
$N_{11}(z,x,w) \notin \mathcal{T}$ and $N_{12}(w,x,z) \notin
\mathcal{T}$ for all $x \in A$ or $x \in B$.

Proof. Suppose $H$ is a cactus of $N$ with support $S$ and
$\mathcal{T} = Tr(N)|_S$. Then it is straight-forward to see that
$\mathcal{T}$ must satisfy (H1)-(H8).

Conversely, suppose $\mathcal{T}$ satisfies (H1)-(H8) with $A$ and $B$
as specified. We claim that $H$ is a cactus of $N$ with support $S$.
We prove the claim for $q = 0$ and remark that the proof for
$q \ge 1$ is similar. Let $x \in A$. If $|S| = 2$ then choose some
$w \in X - S$. By (H5), one of the trinets $N_5(z,x,w)$, $N_6(z,x,w)$,
$N_7(w,x,z)$, $N_8(z,x,w)$, $N_{11}(z,x,w)$, $N_{12}(w,x,z)$ must be
contained in $\mathcal{T}$. But then $H$ must clearly be a cactus of
$N$ (with support $S$).

Assume that $|S| \ge 3$. Then $|A| \ge 2$ and, by (H3),
$N_2(z, a_i, a_j) \in \mathcal{T}$ or $N_2(z, a_j, a_i) \in
\mathcal{T}$ holds for all $\{i,j\} \in \binom{\{1,\ldots,p\}}{2}$.
Since $\mathcal{T} \subseteq Tr(N)$, there must exist a cycle
$C_{i,j}$ in $N$ with split vertex $v_{i,j} := v_{C_{i,j}}$ and end
vertex $b_{i,j} := b_{C_{i,j}}$ that gives rise to that trinet on
$\{z, a_i, a_j\}$, $\{i,j\} \in \binom{\{1,\ldots,p\}}{2}$. We show
that $C_{i,j} = C_{k,l}$ holds for all $\{i,j\}, \{k,l\} \in
\binom{\{1,\ldots,p\}}{2}$. To see this it suffices to show that
$C_{i,j} = C_{i,l}$ holds for all $i \in \{1,\ldots,p\}$ and all
$\{j,l\} \in \binom{\{1,\ldots,p\} - \{i\}}{2}$. So assume for
contradiction that there exists some $i \in \{1,\ldots,p\}$ and some
$\{j,l\} \in \binom{\{1,\ldots,p\} - \{i\}}{2}$ with $C_{i,j} \neq
C_{i,l}$. Without loss of generality assume that $i = 1$.

Note that since N is 1-nested there must exist, for all t G {2, . . . ,p} 
and all x G {oi, at, z}, a unique last vertex f^'* in Ci^t that lies on every 
path from to x. Clearly, v^'* is the end vertex of Ci^t and vlf is 
neither the end vertex nor the split vertex of Ci_t, t G {2, . . . ,p}. Put 

We first show that $b_{1,j} = b_{1,l}$. Suppose for contradiction that
$b_{1,j} \neq b_{1,l}$. Then since $\mathrm{indegree}(z) = 1$, there
must exist a vertex $y_z$ distinct from $z$ that lies simultaneously
on any path in $N$ from $b_{1,j} = v_z^{1,j}$ to $z$ and on any path
in $N$ from $b_{1,l} = v_z^{1,l}$ to $z$. Without loss of generality,
we may assume that $y_z$ is as close to $z$ as possible. So there must
exist a cycle C in TV with {t;, j, 61 j, ^z^, 61 vi ;} C C with possibly 
V = Vij or V = or V = Vij = Vij or bij = or bi^i = holding. 
Since {vij, bi^} C C fl Cij and N is 1-nested this is impossible. Thus 
bi,j = bi^i, as required. 

Similar arguments with $z$ replaced by $a_1$ in the definition of
$y_z$ also imply that $v_{a_1}^{1,j} = v_{a_1}^{1,l}$ must hold. But
then $C_{1,j}$ and $C_{1,l}$ intersect in more than one vertex which
is impossible as $N$ is 1-nested. Thus $C_{1,j} = C_{1,l}$ must hold
for all $j, l \in \{2,\ldots,p\}$. Moreover, by (H4),
$v_{a_i}^{1,2} \neq v_{a_j}^{1,2}$ for all $\{i,j\} \in
\binom{\{1,\ldots,p\}}{2}$. Thus there exists a directed path $P$ from
$v_{1,2}$ to $b_{1,2}$ that crosses the vertices $v_{a_1}^{1,2},
v_{a_2}^{1,2}, \ldots, v_{a_p}^{1,2}$ in that order.

To finish the proof of the claim that $H$ is a cactus of $N$ with
support $S$, we next establish that $V(P) = Y := \{v_{a_1}^{1,2},
v_{a_2}^{1,2}, \ldots, v_{a_p}^{1,2}, v_{1,2}, b_{1,2}\}$. Suppose for
contradiction that this is not the case and that there exists some
$u \in V(P) - Y$. Without loss of generality, we may assume that
$(u, v_{a_1}^{1,2}) \in A(P)$. Since $N$ is 1-nested, there exists
some leaf $w \in L(N) - S$ that is reachable from $u$ without crossing
any further vertex in $C_{1,2}$. We distinguish the cases that
$X = S$ and that $X \neq S$. If $X = S$ then this is impossible and so
$V(P) = Y$, as required. Since $C_{1,2}$ is a cycle in $N$ and $N$ is
1-nested it follows that $(v_{1,2}, b_{1,2})$ is an arc in $N$. But
this implies that $H$ is a cactus of $N$ (with support $S$).

So assume that $S \neq X$. Then (H6) applied to $a_1$, $a_2$, and $w$,
combined with the fact that $N$ is 1-nested, implies that the trinet
$T_1(a_1, a_2, w)$ is contained in $\mathcal{T}$. But then
$\mathcal{T}$ satisfies (H2)-(H7) for the support $S \cup \{w\}$ of
the tuple $H' = (w, a_1, \ldots, a_p : : z)$. In view of (H8), this is
impossible. Thus, $V(P) = Y$, as required.

We now show that $(v_{1,2}, b_{1,2}) \in A(C_{1,2})$. Suppose this is
not the case and there exists some $u \in C_{1,2} - V(P)$. Without
loss of generality we may assume $(v_{1,2}, u) \in A(C_{1,2})$. Then
there exists a leaf $w \in L(N) - S$ such that $u$ is the last vertex
in $C_{1,2}$ on any path from $\rho_N$ to $w$. But then the trinet on
$\{w, a_1, z\}$ is not as specified in (H5) which is impossible. Thus,
$(v_{1,2}, b_{1,2}) \in A(C_{1,2})$, as required. It follows that $H$
must be a cactus of $N$ (with support $S$) in this case, too.

To see that $Tr(N)|_S = \mathcal{T}$, let $N' \in Tr(N)|_S$. Then
$L(N') \cap S \neq \emptyset$. By distinguishing the cases that
$|L(N') \cap S| = 1$, 2, or 3, it is straight-forward to show that
$N' \in \mathcal{T}$ using Properties (H2)-(H8). Also $\mathcal{T}
\subseteq Tr(N)|_S$ holds by Property (H1).

It remains to show that if $H$ is a cactus of $N$ with support $S$ and
$S \neq X$ then $H$ is isolated if and only if $\mathcal{T}$ satisfies
(H9). Assume that $H$ is a cactus of $N$ with support $S$ and that
$S \neq X$. Then, if $H$ is isolated, it is straight-forward to check
that $\mathcal{T}$ satisfies (H9).
Conversely, assume that $\mathcal{T}$ satisfies (H9). We need to show
that $\mathrm{outdegree}(v_H) = 2$ and that $\mathrm{indegree}(v_H) =
1$. We again prove the case $q = 0$ and remark that the arguments for
$q \ge 1$ are similar. Since $H$ is a cactus of $N$ we clearly have
$\mathrm{outdegree}(v_H) \ge 2$. Assume for contradiction that
$\mathrm{outdegree}(v_H) > 2$. Then since $N$ is 1-nested and
$X \neq S$ there must exist some $w \in X - S$ that is reachable from
$v_H$ without crossing a vertex in $C_H - \{v_H\}$, where $C_H$ is the
cycle in $N$ corresponding to $H$. But then there exists some
$x \in A$ such that $N_8(z,x,w)$ or $N_{12}(w,x,z)$ is contained in
$\mathcal{T}$, contradicting (H9). Thus, $\mathrm{outdegree}(v_H) =
2$, as required. But then $\mathrm{indegree}(v_H) \ge 1$ as
$S \neq X$. Assume for contradiction that $\mathrm{indegree}(v_H) >
1$. Then since $S \neq X$ there must exist some $w \in X - S$ such
that $N_{11}(z,a,w) \in \mathcal{T}$ for some $a \in A$, contradicting
again (H9). Thus, $\mathrm{indegree}(v_H) = 1$. This completes the
proof of Lemma 6.2. □

Now, let Rz;s (Rj-s) denote any of the four reductions C, C, H, H 
with z and S as specified in the definition of the reductions. Then 
it is straight-forward to check that if $N$ is a 1-nested network on
$X$, $|X| \ge 3$, and

$\mathcal{T}_{z:S} = \{N' \in Tr(N) : S \cap L(N') \neq \emptyset
\text{ and } S \cap L(N') \neq \{z\}\}$,

then

(1) $Tr(N) = Tr(R_{z:S}(N)) \cup \mathcal{T}_{z:S}$,

or, in other words, $Tr(R_{z:S}(N)) = Tr(N) - \mathcal{T}_{z:S}$.
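
On the level of trinet sets, equation (1) says that a reduction simply
discards the trinets that involve a removed leaf. A minimal sketch,
assuming trinets are stored as (label, leaf tuple) pairs with made-up
labels; the helper name removed_trinets is hypothetical.

```python
def removed_trinets(trinets, S, z):
    """Trinets whose leaf set meets S in something other than {z} alone."""
    S = set(S)
    out = set()
    for t in trinets:
        common = S & set(t[1])
        if common and common != {z}:
            out.add(t)
    return out

# Toy trinet set on X = {x, y, w, u}; we reduce S = {x, y} to z = x.
tr_n = {
    ("T1", ("x", "y", "w")),
    ("T1", ("x", "y", "u")),
    ("T2", ("x", "w", "u")),
    ("T2", ("y", "w", "u")),
}
gone = removed_trinets(tr_n, {"x", "y"}, "x")
tr_reduced = tr_n - gone  # the trinet set left after the reduction
```

Every trinet that remains has all its leaves in $X - (S - \{z\})$,
which is the leaf set of the reduced network.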

Theorem 6.3. Suppose that $N$ and $N'$ are both 1-nested networks on
$X$, $|X| \ge 3$. Then $Tr(N) = Tr(N')$ if and only if $N$ is
isomorphic to $N'$.

Proof. Suppose first that $N$ is isomorphic to $N'$. Then $Tr(N) =
Tr(N')$ follows immediately by using induction on $|X|$, Lemma 5.2
and (1).

To prove the converse we also use induction on $|X|$. If $|X| = 3$,
then the converse obviously holds. So, suppose that, for all
$1 < |X| \le m$, $m \ge 3$, if $Tr(N) = Tr(N')$ then $N$ is isomorphic
to $N'$.

Let $|X| = m + 1$, and suppose that $N$ and $N'$ are 1-nested networks
on $X$ with $Tr(N) = Tr(N')$. By Proposition 5.1 we can apply at least
one of the reductions $R = C, \overline{C}, H, \overline{H}$ to $N$.
Therefore, since $Tr(N) = Tr(N')$, by Lemmas 6.1 and 6.2 we may also
apply the same reduction $R$ to $N'$. Moreover, by (1) we have
$Tr(R(N)) = Tr(R(N'))$. So, by induction, $R(N)$ is isomorphic to
$R(N')$. Therefore, by Lemma 5.2, $R^{-1}(R(N))$ is isomorphic to
$R^{-1}(R(N'))$, i.e. $N$ is isomorphic to $N'$, as required. □
There has been some interest in the literature in defining metrics
on networks [12, page 172], and various metrics have been defined for 
different types of phylogenetic networks including 1-nested networks 
[H El [6l [71 m [To]. Thus the following result could be of interest. For X 
with $|X| \ge 3$, let $\mathcal{N}_1(X)$ denote the set of 1-nested
networks on $X$. In addition, define the map

$d : \mathcal{N}_1(X) \times \mathcal{N}_1(X) \to \mathbb{R}$;
$(N, N') \mapsto d(N, N') := |Tr(N) \,\Delta\, Tr(N')|$,

for all $N, N' \in \mathcal{N}_1(X)$. Then the last theorem
immediately implies:

Corollary 6.4. For $X$ with $|X| \ge 3$, the map $d$ is a (proper)
metric on $\mathcal{N}_1(X)$.

Note that the metric $d$ can be efficiently computed since, for
$N \in \mathcal{N}_1(X)$, it is possible to compute every trinet in
$Tr(N)$ efficiently (essentially because for any $Y \in \binom{X}{3}$
the vertex $v(Y)$ can be computed efficiently using, e.g. the
algorithm presented in [2T]).
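
Once the two trinet sets are available, the distance itself is just
the size of a symmetric difference. A minimal sketch, assuming trinets
are stored as hashable (label, leaf tuple) pairs with made-up labels:

```python
def trinet_distance(tr_n, tr_m):
    """Size of the symmetric difference of two trinet sets."""
    return len(set(tr_n) ^ set(tr_m))

# Toy trinet sets; the labels are placeholders, not taken from a real network
a = {("T1", ("x", "y", "z")), ("T2", ("x", "y", "w"))}
b = {("T1", ("x", "y", "z")), ("N4", ("x", "y", "w"))}
c = {("N4", ("x", "y", "w"))}
d_ab = trinet_distance(a, b)
```

Symmetry and the triangle inequality hold for any symmetric-difference
distance on sets. What Theorem 6.3 adds is that distance 0 forces the
two networks to be isomorphic.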

7. Constructing 1-nested networks from dense sets of trinets

In this section, we present an efficient algorithm which, given a dense 
set T of trinets, can decide whether or not it is displayed by a 1-nested 
network, and if this is the case, constructs the network displaying T 
(see Fig. [10]). 
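
Since every algorithm in this section assumes a dense input, it can be
useful to verify density first. A minimal sketch, assuming trinets are
stored as (label, leaf tuple) pairs; the function name is_dense and the
labels are hypothetical.

```python
from itertools import combinations

def is_dense(X, trinets):
    """True if some trinet is present on every 3-element subset of X."""
    covered = {frozenset(leaves) for _, leaves in trinets}
    return all(frozenset(c) in covered for c in combinations(sorted(X), 3))

# Toy input on four leaves: four 3-subsets, so four trinets are needed
X = {"w", "x", "y", "z"}
full = {
    ("T1", ("w", "x", "y")),
    ("T1", ("w", "x", "z")),
    ("T2", ("w", "y", "z")),
    ("T2", ("x", "y", "z")),
}
partial = {("T1", ("w", "x", "y"))}
```

The check visits all $\binom{|X|}{3}$ subsets, so it runs in time
cubic in $|X|$.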

We begin by describing efficient algorithms for detecting cherries and
cactuses. Given a dense set $\mathcal{T}$ of trinets on $X$, we say
that $S \subseteq X$, $|S| \ge 2$, is a cherry of $\mathcal{T}$ if the
set $\mathcal{T}|_S$ satisfies conditions (C2)-(C4) (note that it
necessarily satisfies (C1)), and that it is isolated if it also
satisfies (C5). We now show that cherries can be found in polynomial
time in a dense set of trinets using the algorithm presented in Fig. 8.

Lemma 7.1. Given a dense set $\mathcal{T}$ of trinets on $X$,
$|X| \ge 3$, algorithm FindCherry is correct and has run-time that is
polynomial in $|X|$.

Proof. It is straight-forward to see that algorithm FindCherry has 
run-time that is polynomial in |X|. 

To see that algorithm FindCherry is correct, first note that it will
clearly terminate. Now, suppose that the algorithm outputs a
(non-empty) set $S$. Then, in view of line 7, $\mathcal{T}|_S$ must
satisfy (C2) and (C3). Moreover, in view of the while loop (lines
6-10) $\mathcal{T}|_S$ must satisfy (C4). So $S$ must be a cherry of
$\mathcal{T}$. Moreover, if the output indicates that $S$ is isolated
(i.e. that $S \neq X$ and that $\mathcal{T}|_S$ satisfies (C5)), then
this must be the case in view of line 8.

Now, suppose that algorithm FindCherry outputs "No cherry of
$\mathcal{T}$ exists", and that, for the purposes of contradiction, a
cherry $S$ of $\mathcal{T}$ does exist. Then, as any cherry has
cardinality at least 2, at some stage the while loop in lines 2-12
must encounter some $\{x, y\} \in \binom{X}{2}$ with $\{x, y\}
\subseteq S$. Clearly, the algorithm will then have to output $S$, a
contradiction. Thus the algorithm FindCherry is correct. □



FindCherry($X$, $\mathcal{T}$)

Input: A set $X$, $|X| \ge 3$, and a dense set $\mathcal{T}$ of trinets on $X$.
Output: A cherry $S$ of $\mathcal{T}$, and a boolean variable $I \in \{T, F\}$,
with $I = T$ if $S$ is isolated and $I = F$ else, or the statement
"No cherry of $\mathcal{T}$ exists".

1. Let $S = \emptyset$, $I = F$, $G = \binom{X}{2}$.
2. While there is some $\{x,y\} \in G$ do
3. If $T_1(x,y,z)$, $T_2(x,y,z)$, $N_3(z,x,y)$, $N_4(x,y,z)$, $N_9(x,y,z)$
4. or $N_{10}(z,x,y)$ is contained in $\mathcal{T}$ for all $z \in X - \{x,y\}$ then do
5. Let $S = \{x,y\}$, $G = \emptyset$ and $U = X - \{x,y\}$.
6. While there is some $u \in U$ do
7. If $\mathcal{T}|_{S \cup \{u\}}$ satisfies (C2) and (C3), then let $S = S \cup \{u\}$.
8. If $U = \{u\}$, $S \neq X$, and $S$ satisfies (C5), then let $I = T$.
9. Let $U = U - \{u\}$.
10. end "do (line 6)"
11. else let $G = G - \{\{x,y\}\}$.
12. end "do (line 2)"
13. If $S = \emptyset$ then output "No cherry of $\mathcal{T}$ exists" else output $S$ and $I$.

Figure 8. Pseudo-code for an algorithm that either finds a cherry of a
dense trinet set $\mathcal{T}$ and also checks whether it is isolated
or not, or determines that no cherry of $\mathcal{T}$ exists.

Now, given a dense trinet set $\mathcal{T}$ on $X$, we say that a
tuple $H = (a_1, a_2, \ldots, a_p : b_1, b_2, \ldots, b_q : z)$ of
distinct elements of $X$, $p \ge 1$, $q \ge 0$, is a cactus of
$\mathcal{T}$ (with support $S = A \cup B \cup \{z\}$, $A = \{a_1,
\ldots, a_p\}$ and $B = \{b_1, \ldots, b_q\}$) if $\mathcal{T}|_S$
satisfies conditions (H2)-(H8) of Lemma 6.2 (note that
$\mathcal{T}|_S$ necessarily satisfies (H1)). Moreover, such an $H$ is
isolated if $S \neq X$ and $\mathcal{T}|_S$ also satisfies condition
(H9) of Lemma 6.2.

Note that if $H = (a_1, a_2, \ldots, a_p : b_1, b_2, \ldots, b_q : z)$
is a cactus of $\mathcal{T}$, then the relation $\sim_{\mathcal{T}}$
defined on the set $Y = S - \{z\} = A \cup B$ by putting
$y \sim_{\mathcal{T}} y'$ if and only if $y = y'$ or $N_2(z,y,y')$ or
$N_2(z,y',y) \in \mathcal{T}$, for all $y, y' \in Y$, is an
equivalence relation on $Y$ with (at most two) equivalence classes
$A$, $B$. Moreover, the relation $<_{\mathcal{T}}$ defined on $Y$ by
$y <_{\mathcal{T}} y'$ if and only if $N_2(z,y,y') \in \mathcal{T}$,
for all $y, y' \in Y$, is a strict partial order on $Y$, which
restricts to a strict linear order on $A$ and also on $B$.
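
The two relations above can be computed directly from the $N_2$
trinets that have $z$ at the bottom. The sketch below assumes, for
illustration only, that these trinets have already been extracted into
a set n2 of ordered pairs $(y, y')$ meaning $N_2(z, y, y')$ is in the
trinet set; the function name classes_and_orders is hypothetical.

```python
from itertools import product

def classes_and_orders(Y, n2):
    """Split Y into at most two classes of ~ and order each class by <.

    Returns a list of tuples (one per class), or None if ~ is not an
    equivalence relation with at most two classes or < is not a strict
    partial order.
    """
    Y = sorted(Y)

    def sim(a, b):
        return a == b or (a, b) in n2 or (b, a) in n2

    # ~ is reflexive and symmetric by construction; test transitivity
    for a, b, c in product(Y, repeat=3):
        if sim(a, b) and sim(b, c) and not sim(a, c):
            return None
    classes = []
    for y in Y:
        home = next((cls for cls in classes if sim(y, cls[0])), None)
        if home is None:
            classes.append([y])
        else:
            home.append(y)
    if len(classes) > 2:
        return None
    # < must be irreflexive, asymmetric and transitive
    for a, b in n2:
        if a == b or (b, a) in n2:
            return None
    for (a, b), (c, d) in product(n2, repeat=2):
        if b == c and (a, d) not in n2:
            return None
    # two distinct elements of one class are comparable by definition of ~,
    # so counting predecessors inside the class yields the linear order
    return [tuple(sorted(cls, key=lambda y: sum((x, y) in n2 for x in cls)))
            for cls in classes]

# Toy data: a1 < a2 < a3 on one side, b1 alone on the other
pairs = {("a1", "a2"), ("a2", "a3"), ("a1", "a3")}
sides = classes_and_orders({"a1", "a2", "a3", "b1"}, pairs)
```

This mirrors the tests made in lines 6-11 of FindCactus, with the two
returned tuples playing the role of the ordered classes $E$ and $E'$.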

Using these observations, we now show that the algorithm presented
in Fig. 9 can be used to detect cactuses in a dense set of trinets in
polynomial time.

Lemma 7.2. Given a dense set $\mathcal{T}$ of trinets on $X$,
$|X| \ge 3$, algorithm FindCactus is correct and has run-time that is
polynomial in $|X|$.

Proof. First note that the algorithm will clearly terminate. Moreover,
if it does output a tuple then in view of lines 12 and 13 this must be
a cactus of $\mathcal{T}$ and it will be isolated only if $I = T$. In
addition, if the algorithm outputs "No cactus of $\mathcal{T}$
exists", then this must be the case. Otherwise, suppose there is some
cactus $K = (a_1, \ldots, a_p : b_1, \ldots, b_q : z)$ of
$\mathcal{T}$, $p \ge 1$, $q \ge 0$. Setting $A = \{a_1, \ldots,
a_p\}$ and $B = \{b_1, \ldots, b_q\}$ it follows that $S = A \cup B
\cup \{z\}$ is the support of $K$ and that $z$ must be at the bottom
of some trinet in $\mathcal{T}$. Thus the while loop (lines 2-20)
would eventually find $z$ at line 3. Since $K$ is a cactus of
$\mathcal{T}$, for each element $y \in Y := A \cup B$, there exists
some $N \in \mathcal{T}$ such that $y$ hangs off the side of $N$ and
$z$ is at the bottom of $N$. Moreover, $A$ and $B$ (in case
$B \neq \emptyset$) are the equivalence classes of the relation
$\sim_{\mathcal{T}}$ defined on $Y$ and the elements in $A$ and $B$
(again in case $B \neq \emptyset$) are strictly linearly ordered by
$<_{\mathcal{T}}$. Thus, the algorithm would form the tuple
$F = (a_1, \ldots, a_p : b_1, \ldots, b_q : z)$ (lines 10 and 11).
Clearly, the support of $F$ is $S$. Since $\mathcal{T}|_S$ satisfies
(H2)-(H8) it follows that $F$ is returned by the algorithm. However,
since $F = K$, this is impossible.

Finally, to see that algorithm FindCactus is polynomial in $|X|$, it
is sufficient to note that lines 6-7, 8-9 and 12-13 can all clearly be
executed in time that is polynomial in $|X|$. □

We now use the algorithms FindCherry and FindCactus to show that it
can be decided in polynomial time whether or not a dense set of
trinets is displayed by a 1-nested network using the algorithm
presented in Fig. 10.

Theorem 7.3. For $X$ with $|X| \ge 3$ and $\mathcal{T}$ a dense set of
trinets on $X$, algorithm BuildNet has run-time that is polynomial in
$|X|$ and is correct.
FindCactus($X$, $\mathcal{T}$)

Input: A set $X$, $|X| \ge 3$, and a dense set $\mathcal{T}$ of trinets on $X$.
Output: A cactus $H$ of $\mathcal{T}$ and a boolean variable $I \in \{T, F\}$,
with $I = T$ if $H$ is isolated and $I = F$ else, or the statement
"No cactus of $\mathcal{T}$ exists".

1. Put $H = \emptyset$, $I = F$, $G = X$.
2. While there is some $z \in G$ do
3. If there is a trinet $N \in \mathcal{T}$ such that $z$ is at the bottom of $N$, then do
4. Let $Y$ be the set of $y \in X - \{z\}$ such that $y$ hangs off the side of
5. some $N \in \mathcal{T}$ for which $z$ is at the bottom of $N$.
6. If the relation $\sim_{\mathcal{T}}$ is an equivalence relation on $Y$
7. that has at most two equivalence classes $E$, $E'$, then do
8. If the relation $<_{\mathcal{T}}$ on $Y$ is a partial order on $Y$ that also restricts
9. to give a strict linear order on $E$ and on $E'$ then do
10. Let $F = (a_1, \ldots, a_p : b_1, \ldots, b_q : z)$ and $S = Y \cup \{z\}$, where
11. $E = \{a_1, \ldots, a_p\}$ and $E' = \{b_1, \ldots, b_q\}$ are ordered relative to $<_{\mathcal{T}}$.
12. If $\mathcal{T}|_S$ satisfies (H2)-(H8), then let $H = F$ and $G = \emptyset$ and, if
13. $\mathcal{T}|_S$ also satisfies (H9) then let $I = T$, else let $G = G - \{z\}$.
14. end "do (line 11)"
15. else let $G = G - \{z\}$.
16. end "do (line 8)"
17. else let $G = G - \{z\}$.
18. end "do (line 3)"
19. else let $G = G - \{z\}$.
20. end "do (line 2)"
21. If $H = \emptyset$ then output "No cactus of $\mathcal{T}$ exists" else output $H$ and $I$.

Figure 9. Pseudo-code for an algorithm that either finds a cactus of a
dense trinet set $\mathcal{T}$ and also decides whether it is isolated
or not, or determines that no cactus of $\mathcal{T}$ exists.

Proof. Algorithm BuildNet has run-time that is polynomial in $|X|$
since the check required in line 2 can be executed in time that is
polynomial in $|X|$ by Lemmas 7.1 and 7.2.

Now, if algorithm BuildNet outputs "There is no 1-nested network
displaying $\mathcal{T}$", then by Proposition 5.1, Lemma 6.1 and Lemma 6.2
BuildNet($\mathcal{T}$)

Input: A set $X$, $|X| \ge 3$, and a dense set $\mathcal{T}$ of trinets on $X$.
Output: A 1-nested network $N$ on $X$ with $Tr(N) = \mathcal{T}$, or
the statement "There is no 1-nested network displaying $\mathcal{T}$".

1. Stack $= \emptyset$, $G = X$
2. While there is some cherry $S$ in $\mathcal{T}$ with $z \in S$ or some cactus
3. $H = (a_1, \ldots, a_p : b_1, \ldots, b_q : z)$ with support $S = \{a_1, \ldots, a_p, b_1, \ldots, b_q, z\}$
4. in $\mathcal{T}$ do
5. Put the symbol $R_{z:S}$ on the top of Stack.
6. If $|G - (S - \{z\})| \le 2$, then let $N$ be either the bush on $G$ or
7. the two-leafed network on $G$, depending on $\mathcal{T}$.
8. Let $\mathcal{T} = \mathcal{T} - \mathcal{T}_{z:S}$, $G = G - (S - \{z\})$.
9. end "do (line 2)"
10. If $|G| \ge 3$, then output "There is no 1-nested network displaying $\mathcal{T}$"
11. else do
12. While there is some $R_{z:S}$ on the top of Stack, do $N = R^{-1}_{z:S}(N)$.
13. Output $N$
14. end "do (line 12)"

Figure 10. Pseudo-code for an algorithm to construct a 1-nested
network from a dense set of trinets, or decide that such a network
does not exist.

there is no 1-nested network $N$ on $X$ with $Tr(N) = \mathcal{T}$.
Moreover, if BuildNet outputs a network $N$, then $N$ is clearly
1-nested, and $Tr(N) = \mathcal{T}$ by (1). This completes the proof. □

Remark 7.4. Although we have shown that algorithm BuildNet has 
run-time that is polynomial in \X\, it could be of interest to see if faster, 
more sophisticated algorithms can be developed. 

8. Discussion 

In this paper, we have shown that we can recover a 1-nested net- 
work from 'perfect data', viz. the dense set of 1-nested trinets that 
is displayed by the network. In practice, we will not usually have ac- 
cess to such information for biological datasets. Even so, it should be 
quite straight-forward to at least compute a dense set of trinets for any 
given biological dataset using existing phylogenetic network methods. 
Figure 11. The 1-nested network on $\{w, x, y, z\}$ depicted in (a) is
uniquely determined by the two trinets pictured in (b). As before,
directions are omitted for clarity when clear. Also only the vertices
that are leaves are marked by a dot.



For example, given a multiple sequence alignment, one could compute 
the most parsimonious or most likely trinet for every sub-alignment of 
3 sequences (using, e.g. methods described in [IHII19]), which would be 
feasible as there are a bounded number of 1-nested trinets. Note that 
this would have the advantage that no 'breakpoints' would need to be 
computed for the multiple alignment, which is a first (and sometimes 
quite difficult) step that is usually required when constructing
phylogenetic networks from phylogenetic trees (cf. e.g.
[12, Chapter 11], [23, Section 2]).

Given that computing dense sets of trinets is feasible for biological 
data, it could be reasonable to develop methods for finding 1-nested 
networks displaying as many trinets as possible from a dense set of 
trinets. Similar techniques have been developed for triplets e.g. [HI 
[131 [31], although it is worth noting that it is NP-hard to find a tree 
displaying a maximum number of rooted triplets from an arbitrary set 
of triplets [21[IS1[33] (even if the set is dense [3]). Alternatively, it might 
be of interest to investigate if there might be an 'Aho-type' algorithm 
[1] to determine if an arbitrary subset of 1-nested trinets encodes a 1- 
nested network, and, if so, adapt this to give 'Min-Cut' type algorithms 
for building 1-nested networks from sets of trinets (cf. |2¥1 [261 |2Z|)- A 
first step in this direction could be to determine whether or not it is 
an NP-complete problem to decide if an arbitrary subset of 1-nested 
trinets encodes a 1-nested network (in particular, note that there are 
non-dense sets of 1-nested trinets that encode 1-nested networks - e.g. 
the 1-nested network $N$ on $\{w, x, y, z\}$ pictured in Fig. 11(a) is
the only 1-nested network on $\{w, x, y, z\}$ displaying the two
trinets presented in Fig. 11(b)).

In another direction, clearly we can ask for results along the lines
of those presented above for level-$k$ networks [9], $k \ge 2$,
phylogenetic networks that have a bounded level of complexity
depending on $k$ (and also, of course, '$k$-nested' networks). Note
that there are non-recoverable level-2 networks (e.g. Fig. Hj), and so
this could be rather more technical. Moreover, it should be noted
that, for k > 3, there are level-$k$ networks that are not of
level-$(k - 1)$ all of whose trinets have fixed level (see Fig. 12).
Thus, the levels of the trinets displayed by a network do not
necessarily determine the level of a network. For practical purposes,
it might also be of interest to determine a way to enumerate the
level-$k$ trinets, $k \ge 2$.

Figure 12. A level-n phylogenetic network on {x_1, x_2, ..., x_n},
n > 4, for which every trinet in $Tr(N)$ is of level-3. For clarity
arc directions are omitted when clear.

Another avenue worth exploring could be to try generalizing the above
results to '$r$-nets', $r \ge 4$, i.e. phylogenetic networks with $r$
leaves (note that in case $r = 4$ quartet trees are commonly used to
build phylogenetic trees, e.g. [29]). Note that it is straight-forward
to extend Definition 3.1 to obtain a set of $r$-nets displayed by a
phylogenetic network. This could be quite useful in practice since it
might be possible to obtain more accurate estimates for $r$-nets than
trinets (at least for $r = 4$) before we try to piece them together,
although, technically speaking, this could be very challenging.

Finally, we conclude with what we consider to be a rather bold con- 
jecture: 

Conjecture 8.1. If $N$ is a recoverable phylogenetic network on $X$,
then $Tr(N)$ encodes $N$, that is, if $N'$ is a recoverable
phylogenetic network on $X$ such that $Tr(N) = Tr(N')$ then $N$ is
isomorphic to $N'$.

A first (and probably quite instructive!) 'exercise' could be to try
and show that this conjecture at least holds for level-2 networks. Note
that if this conjecture were true, then as in Corollary 6.4 we would
immediately obtain a new proper metric on the set of recoverable
phylogenetic networks on $X$.